Selective Sampling for Nearest Neighbor Classifiers

Authors

  • MICHAEL LINDENBAUM
  • SHAUL MARKOVITCH
Abstract

Most existing inductive learning algorithms assume the availability of a training set of labeled examples. In many domains, however, labeling the examples is a costly process that requires either intensive computation or manual labor. In such cases, it may be beneficial for the learner to be active by intelligently selecting examples for labeling, with the goal of reducing the labeling cost. In this paper we propose a lookahead algorithm for selective sampling of examples for nearest-neighbor classifiers. The algorithm attempts to find the example with the highest utility, considering its effect on the resulting classifier. Computing the expected utility of an example requires estimating the probability of the possible labels. We propose to use the random field model for this estimation. The proposed selective sampling algorithm was evaluated empirically on real and artificial data sets. The experiments show that the proposed algorithm outperforms other methods.
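The lookahead idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `label_probability` below is a crude distance-based stand-in for the paper's random field estimator, and the utility is a simple proxy (the expected fraction of evaluation predictions that the newly labeled example would change).

```python
import numpy as np

def knn_predict(X_train, y_train, X):
    """Classify each row of X by its single nearest labeled neighbor."""
    preds = []
    for x in X:
        d = np.linalg.norm(X_train - x, axis=1)
        preds.append(y_train[np.argmin(d)])
    return np.array(preds)

def label_probability(X_lab, y_lab, x, eps=1e-9):
    """Crude stand-in for the paper's random-field estimate: a label is
    more probable the closer x lies to labeled examples of that class."""
    scores = {}
    for c in set(y_lab):
        d = np.min(np.linalg.norm(X_lab[y_lab == c] - x, axis=1))
        scores[c] = 1.0 / (d + eps)
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

def select_example(X_lab, y_lab, X_pool, X_eval):
    """One-step lookahead: pick the pool example with the highest
    expected utility, averaging over its possible labels."""
    base = knn_predict(X_lab, y_lab, X_eval)
    best_i, best_u = 0, -np.inf
    for i, x in enumerate(X_pool):
        probs = label_probability(X_lab, y_lab, x)
        u = 0.0
        for c, p in probs.items():
            X2 = np.vstack([X_lab, x])
            y2 = np.append(y_lab, c)
            new = knn_predict(X2, y2, X_eval)
            # utility proxy: expected fraction of evaluation points
            # whose prediction the new label would change
            u += p * np.mean(new != base)
        if u > best_u:
            best_i, best_u = i, u
    return best_i
```

The paper's actual utility is the expected accuracy of the resulting classifier under the random field model; the change-based proxy above merely shows the shape of the lookahead loop.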


Similar articles

Selective Sampling for Nearest Neighbor Classifiers

In the passive, traditional approach to learning, the information available to the learner is a set of classified examples, which are randomly drawn from the instance space. In many applications, however, the initial classification of the training set is a costly process, and training examples are intelligently selected from unlabeled data by an active learner. This paper proposes a...


Combining Nearest Neighbor Classifiers Through Multiple Feature Subsets

Combining multiple classifiers is an effective technique for improving accuracy. There are many general combining algorithms, such as Bagging or Error Correcting Output Coding, that significantly improve classifiers like decision trees, rule learners, or neural networks. Unfortunately, many combining methods do not improve the nearest neighbor classifier. In this paper, we present MFS, a combining a...


Consistency of Nearest Neighbor Classification under Selective Sampling

This paper studies nearest neighbor classification in a model where unlabeled data points arrive in a stream, and the learner decides, for each one, whether to ask for its label. Are there generic ways to augment or modify any selective sampling strategy so as to ensure the consistency of the resulting nearest neighbor classifier?


Nearest neighbor classification from multiple feature subsets

Combining multiple classifiers is an effective technique for improving accuracy. There are many general combining algorithms, such as Bagging, Boosting, or Error Correcting Output Coding, that significantly improve classifiers like decision trees, rule learners, or neural networks. Unfortunately, these combining methods do not improve the nearest neighbor classifier. In this paper, we present MFS, a...


Evaluation Accuracy of Nearest Neighbor Sampling Method in Zagross Forests

Collection of appropriate qualitative and quantitative data is necessary for proper management and planning. Using suitable inventory methods is essential, and the accuracy of sampling methods depends on the inventory network and the number of sample points. The nearest neighbor sampling method is one of the distance methods and is calculated by three equations (Byth and Ripley, 1980; Cottam and Curtis, 1956; and Cota...
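One of the classical distance-method estimators cited above (Cottam and Curtis, 1956) can be sketched as follows; this is a hedged illustration of the mean-distance form of the estimator, not the specific equations evaluated in the article, and assumes distances are measured in the same length unit used for the area.

```python
def cottam_curtis_density(distances):
    """Estimate stand density from point-to-nearest-plant distances:
    density ~ 1 / (2 * mean nearest-neighbor distance) ** 2."""
    mean_d = sum(distances) / len(distances)
    return 1.0 / (2.0 * mean_d) ** 2
```

For example, a mean nearest-neighbor distance of 0.5 units yields an estimated density of one individual per square unit.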



Journal:

Volume   Issue

Pages  -

Publication date: 1999